AITopics | source separation

Collaborating Authors

source separation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction

Neural Information Processing SystemsJun-18-2026, 22:22:46 GMT

Unlike prior approaches constrained to fixed instrument classes, MGE-LDM learns a joint distribution over full mixtures, submixtures, and individual stems within a single compact latent diffusion model. At inference, MGE-LDM enables (1) complete mixture generation, (2) partial generation (i.e., source imputation), and (3) textconditioned extraction of arbitrary sources. By formulating both separation and imputation as conditional inpainting tasks in the latent space, our approach supports flexible, class-agnostic manipulation of arbitrary instrument sources. Notably, MGE-LDM can be trained jointly across heterogeneous multi-track datasets (e.g., Slakh2100, MUSDB18, MoisesDB) without relying on predefined instrument categories. Audio samples are available at our project page .

artificial intelligence, international conference, machine learning, (17 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Overview (0.92)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

DeepASA: An Object-Oriented Multi-Purpose Network for Auditory Scene Analysis

Neural Information Processing SystemsJun-14-2026, 07:51:47 GMT

We propose DeepASA, a multi-purpose model for auditory scene analysis that performs multi-input multi-output (MIMO) source separation, dereverberation, sound event detection (SED), audio classification, and direction-of-arrival estimation (DoAE) within a unified framework. DeepASA is designed for complex auditory scenes where multiple, often similar, sound sources overlap in time and move dynamically in space. To achieve robust and consistent inference across tasks, we introduce an object-oriented processing (OOP) strategy.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.36)

Add feedback

ZeroSep: Separate Anything in Audio with Zero Training

Neural Information Processing SystemsJun-13-2026, 15:08:01 GMT

Audio source separation is fundamental for machines to understand complex acoustic environments and underpins numerous audio applications. Current supervised deep learning approaches, while powerful, are limited by the need for extensive, task-specific labeled data and struggle to generalize to the immense variability and open-set nature of real-world acoustic scenes. Inspired by the success of generative foundation models, we investigate whether pre-trained text-guided audio diffusion models can overcome these limitations. We make a surprising discovery: zero-shot source separation can be achieved purely through a pre-trained text-guided audio diffusion model under the right configuration. Our method, named ZeroSep, works by inverting the mixed audio into the diffusion model's latent space and then using text conditioning to guide the denoising process to recover individual sources. Without any task-specific training or fine-tuning, ZeroSep repurposes the generative diffusion model for a discriminative separation task and inherently supports open-set scenarios through its rich textual priors. ZeroSep is compatible with a variety of pre-trained text-guided audio diffusion backbones and delivers strong separation performance on multiple separation benchmarks, surpassing even supervised methods.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

Isolating Nonlinear Independent Sources in fMRI with $β$-TCVAE Models

Li, Qiang, Yu, Shujian, Malo, Jesus, Liu, Jingyu, Adali, Tülay, Calhoun, Vince D.

arXiv.org Machine LearningMay-19-2026

Learning meaningful latent representations from nonlinear fMRI data remains a fundamental challenge in neuroimaging analysis. Traditional independent component analysis, widely used due to its ability to estimate interpretable functional brain networks, relies on a linear mixing assumption for latent sources, limiting its ability to capture the inherently nonlinear and complex organization of brain dynamics. More recently, deep representation learning methods have emerged as promising alternatives for modeling nonlinear latent structure. However, many of these approaches have been evaluated primarily on simulated datasets or natural image benchmarks, with comparatively limited validation on real-world neuroimaging data such as fMRI. In this work, we are motivated by the $β$-TCVAE (Total Correlation Variational Autoencoder), a refinement of the $β$-VAE framework for learning latent representations without introducing additional hyperparameters during training. We adapt and modify this model to fMRI data for nonlinear source disentanglement, aiming to separate mixed spatial and temporal brain signals into interpretable components. We show that the $β$-TCVAE framework can recover meaningful nonlinear spatial components with biological relevance, including well-established intrinsic connectivity networks such as the default mode network. Furthermore, we evaluate the learned representations using functional network connectivity, showing that the latent structure captures coherent and interpretable brain organization patterns. This study provides a pilot investigation that bridges nonlinear representation learning and fMRI analysis.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Machine Learning

2605.16708

Country:

North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Maryland > Baltimore (0.14)

Genre: Research Report > New Finding (0.94)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

Add feedback

51200d29d1fc15f5a71c1dab4bb54f7c-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 21:40:24 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario (0.28)

Industry:

Media (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

106b2434b8d496c6aed9235d478678af-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 23:09:49 GMT

artificial intelligence, diffusion model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(2 more...)

Add feedback

Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA

Aapo Hyvarinen, Hiroshi Morioka

Neural Information Processing SystemsApr-22-2026, 07:03:55 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, feature extractor, machine learning, (13 more...)

Neural Information Processing Systems

Country: Europe (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

StrEBM: A Structured Latent Energy-Based Model for Blind Source Separation

Wei, Yuan-Hao

arXiv.org Machine LearningApr-21-2026

This paper proposes StrEBM, a structured latent energy-based model for source-wise structured representation learning. The framework is motivated by a broader goal of promoting identifiable and decoupled latent organization by assigning different latent dimensions their own learnable structural biases, rather than constraining the entire latent representation with a single shared energy. In this sense, blind source separation is adopted here as a concrete and verifiable testbed, through which the evolution of latent dimensions toward distinct underlying components can be directly examined. In the proposed framework, latent trajectories are optimized directly together with an observation-generation map and source-wise structural parameters. Each latent dimension is associated with its own energy-based formulation, allowing different latent components to gradually evolve toward distinct source-like roles during training. In the present study, this source-wise energy design is instantiated using Gaussian-process-inspired energies with learnable length-scales, but the framework itself is not restricted to Gaussian processes and is intended as a more general structured latent EBM formulation. Experiments on synthetic multichannel signals under linear and nonlinear mixing settings show that the proposed model can recover source components effectively, providing an initial empirical validation of the framework. At the same time, the study reveals important optimization characteristics, including slow late-stage convergence and reduced stability under nonlinear observation mappings. These findings not only clarify the practical behavior of the current GP-based instantiation, but also establish a basis for future investigation of richer source-wise energy families and more robust nonlinear optimization strategies.

artificial intelligence, latent dimension, machine learning, (14 more...)

arXiv.org Machine Learning

2604.17381

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

StrADiff: A Structured Source-Wise Adaptive Diffusion Framework for Linear and Nonlinear Blind Source Separation

Wei, Yuan-Hao

arXiv.org Machine LearningApr-8-2026

This paper presents a Structured Source-Wise Adaptive Diffusion Framework for linear and nonlinear blind source separation. The framework interprets each latent dimension as a source component and assigns to it an individual adaptive diffusion mechanism, thereby establishing source-wise latent modeling rather than relying on a single shared latent prior. The resulting formulation learns source recovery and the mixing/reconstruction process jointly within a unified end-to-end objective, allowing model parameters and latent sources to adapt simultaneously during training. This yields a common framework for both linear and nonlinear blind source separation. In the present instantiation, each source is further equipped with its own adaptive Gaussian process (GP) prior to impose source-wise temporal structure on the latent trajectories, while the overall framework is not restricted to Gaussian process priors and can in principle accommodate other structured source priors. The proposed model thus provides a general structured diffusion-based route to unsupervised source recovery, with potential relevance beyond blind source separation to interpretable latent modeling, source-wise disentanglement, and potentially identifiable nonlinear latent-variable learning under appropriate structural conditions.

artificial intelligence, machine learning, trajectory, (14 more...)

arXiv.org Machine Learning

2604.04973

Country: Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.05)

Genre: Research Report (0.83)

Technology: